- Short: Mathffp.library FPU speedup patch 1.5b
- Author: Jess Sosnoski (at the below address!!!)
- Uploader: starblaz@postoffice.ptd.net
- Version: 1.5beta
- Type: util/boot
- Requires: kick 2.04+?, an 020+, and a 68881/2 FPU.
- Long:
- FFPpatch 1.5beta © 1997 Jess Sosnoski
- Tieee function © 1995 Martin Berndt
- -----------
- This is a program that patches some functions of the mathffp.library
- to use 68881/2 instructions, thus squeezing out a bit more speed.
- (I hope!)
- This is also the first speedup patch I ever attempted to write! :)
- Works on an '040 or '060 too!
- (does not use any FPU trig functions)
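For anyone wondering what the patch has to juggle: mathffp.library works on Motorola Fast Floating Point (FFP) words, not IEEE singles, so the numbers have to be converted before the FPU can use them and converted back afterwards (the job of the Tieee/Fieee style functions). Below is a minimal Python sketch of the FFP layout only. It is not the patch's assembly code. The function names are made up for illustration, and truncating (rather than rounding) the mantissa is an assumption.

```python
import math

def ffp_to_float(ffp: int) -> float:
    """Decode a 32-bit Motorola FFP word into a Python float (illustrative helper)."""
    mantissa = (ffp >> 8) & 0xFFFFFF   # bits 31..8, normalized so the top bit is set
    sign = (ffp >> 7) & 1              # bit 7 is the sign
    exponent = ffp & 0x7F              # bits 6..0, stored excess-64
    if mantissa == 0:
        return 0.0                     # FFP zero is the all-zero word
    # value = (mantissa / 2**24) * 2**(exponent - 64)
    value = math.ldexp(mantissa, exponent - 64 - 24)
    return -value if sign else value

def float_to_ffp(x: float) -> int:
    """Encode a Python float as an FFP word (illustrative helper, truncates the mantissa)."""
    if x == 0.0:
        return 0
    sign = 0x80 if x < 0 else 0
    frac, exp = math.frexp(abs(x))     # abs(x) == frac * 2**exp, with 0.5 <= frac < 1
    mantissa = int(frac * (1 << 24))   # keep 24 bits; assumption: truncate, do not round
    exponent = exp + 64
    if not 0 <= exponent <= 127:
        raise OverflowError("value outside the FFP exponent range")
    return (mantissa << 8) | sign | exponent

print(hex(float_to_ffp(1.0)))          # 0x80000041
print(ffp_to_float(0xA00000C2))        # -2.5
```

The sign sits in the middle of the word (bit 7) and the exponent is only 7 bits wide, which is why a plain IEEE load will not do.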
- ----------
- Use this program at your own risk!
- I assume no responsibility or liability for problem(s) and/or damage(s)
- that occur by the use, modification, and/or existence of this product,
- and/or its parts in any form.
- But if it doesn't work right, or slows things down, don't be afraid
- to send me email...or even send some email if it makes your system
- outprocess a Cray (yeah, right)
- Anywayz tho...the following functions are patched, and here's the
- speed differences I got with a test program I wrote:
- *NOTE* The test program must call float() a lot, because when I re-enabled
- the SPFloat() patch, the test program ran one hell of a lot faster.
- I've included the results before I re-enabled the float() patch so you
- can see what differences there were.
- Function    PATCHED    Not_Patched
- -------------------------------------------
- SPAbs         2.21       12.02
- SPNeg         2.16       12.20
- SPAdd         5.18       18.91
- SPSub         5.17       17.78
- SPMul         5.18       20.70
- SPDiv         5.26       20.96
- SPFloat       2.13       11.94
- SPFix         2.96       14.26
- SPFloor       2.16       15.13
- SPCeil        4.75       27.41
- (SPTst and SPCmp are also patched)
- Hmmm....very interesting.....that's why this patch is a beta :)
- Your results will likely differ.
- (The testprogram isn't the best in the world)
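To make the table above easier to read at a glance, here is a small Python sketch that turns the two timing columns into speedup ratios (Not_Patched divided by PATCHED). The numbers are copied from the table. The readme does not say what unit they are in, so only the ratios mean anything here, and the names TIMINGS and speedups are just for illustration.

```python
# (patched, not_patched) timings copied from the results table; units are not stated
TIMINGS = {
    "SPAbs":   (2.21, 12.02),
    "SPNeg":   (2.16, 12.20),
    "SPAdd":   (5.18, 18.91),
    "SPSub":   (5.17, 17.78),
    "SPMul":   (5.18, 20.70),
    "SPDiv":   (5.26, 20.96),
    "SPFloat": (2.13, 11.94),
    "SPFix":   (2.96, 14.26),
    "SPFloor": (2.16, 15.13),
    "SPCeil":  (4.75, 27.41),
}

def speedups(timings):
    """Return {function: not_patched / patched} for every row."""
    return {name: unpatched / patched
            for name, (patched, unpatched) in timings.items()}

for name, ratio in speedups(TIMINGS).items():
    print(f"{name:8s} {ratio:5.2f}x")
```

On these figures the ratios run from roughly 3.4x (SPSub) to roughly 7x (SPFloor), on the author's machine and with the author's test program.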
- ------------
- Copy it to your c: directory or wherever you like.
- You can add the line run <>NIL: ffppatch to your startup-sequence, or
- user-startup. You can give it an icon and put it in WBstartup.
- You can basically put it anywhere :)
- (I have mine a little after setpatch in my startup-sequence)
- -----
- run <>NIL: ffppatch
- ------
- None...as I didn't feel like figuring out how to do text output
- in assembly.
- Although, if you don't have the right versions of the required libraries,
- it will exit with a returncode of 20.
- -------
- sorry....once it's in...it stays in!
- (didn't I hear Al Bundy say that to Peg once...hmmm....)
- -------
- This patch is experimental, so, don't expect any miracles.
- Opens mathffp.library, and never closes it.
- Also...not a whole hell of a lot of programs use mathffp.library, but those
- which do may benefit from a little speedup.
- (although some garshneblankers, the akJFIF, akLJPG, and akPNG
- datatypes use it)
- There exists a 68881 card for 68000 owners, but, as I found out via
- email, it does not access the FPU in the same way that an 020+ would.
- I did not include any special code to support this, so it is likely that
- this patch may not do anything at all on that type of setup.
- (Although, if someone would like to let me know if it does...email me!)
- -----
- mathffp.library is in your Kickstart ROM...you won't find it in libs.
- I've noticed that 040 owners are getting horrible results with
- a few functions...I'll have to figure something out about it....hmmm...
- -------
- Here it is! Just type ffptest and it will dump out the results.
- It disables multitasking while running. It does actually give results
- with an 040 or 060. (though the numbers are quite small)
- The best thing you can do is go into the bootshell, with
- ALL CPU CACHES OFF--leaving them on can give different results with or
- without ffppatch when run multiple times (I found this out the hard way)
- (trust me, you WILL see a difference)
- Note: the testprogram is written in PCQ pascal and may be a bit dodgy.
- ffptest
- run <>NIL: ffppatch
- ffptest
- Then you will see a difference.
- -------
- 1.0 First Release
- 1.1 Now closes mathffp.library when mathtrans.lib v40 can't be opened
- Optimized spmul, spdiv, spabs, spneg, spflt, spfloor and squeezed
- a couple more clock cycles out.
- (spadd and spsub optimized too, but disabled cuz they were
- slower for some reason :( )
- Removed SPFix patch...it was slower :(
- 1.2 Due to my rewriting of ffptest (included), I was able to more
- accurately test speed of calls to mathffp.library.
- I found out everything went faster in the first place :)
- Re-enabled everything.
- 1.3 Contacted Martin Berndt, author of fmath40x.lha, and asked
- about the Fieee and Tieee functions....so, he emailed me the
- source...and I removed the mathtrans.library requirement and
- put the functions directly into the patch program, making it
- faster :)
- 1.4 Longword aligned all patches.
- Changed spabs, spneg and spflt functions to not use fpu...
- they didn't have to do all that work, and were probably horribly
- slower before.
- (worst case..they're probably the same speed as the stock library now
- ..but, one of them can likely be inlined directly into the
- jumptable in some future version to save a clock cycle or two)
- Changed the SPFieee function to a faster one.
- Changed the SPFix function to something more like the original.
- 1.5 Inlined the SPFieee function wherever possible, shaving off more
- clock cycles.
- Included Dave Jones' optimized SPTst and SPCmp functions!
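A note on the 1.4 entry about spabs and spneg not needing the FPU: in the FFP word layout the sign is a single bit (bit 7), so absolute value and negation can be done with one integer bit operation and no format conversion at all. The Python sketch below shows the idea only. It is not the code from the patch, the function names are invented, and leaving zero untouched in the negate case is an assumption about how an all-zero FFP word should be handled.

```python
SIGN_BIT = 0x80  # bit 7 of a 32-bit FFP word holds the sign

def ffp_abs(ffp: int) -> int:
    """Absolute value: clear the sign bit, leave mantissa and exponent alone."""
    return ffp & ~SIGN_BIT & 0xFFFFFFFF

def ffp_neg(ffp: int) -> int:
    """Negate: flip the sign bit. Assumption: the all-zero word (FFP zero) stays zero."""
    if ffp == 0:
        return 0
    return ffp ^ SIGN_BIT

print(hex(ffp_abs(0xA00000C2)))  # 0xa0000042
print(hex(ffp_neg(0x80000041)))  # 0x800000c1
```

Here 0xA00000C2 is -2.5 and 0x80000041 is 1.0 in FFP, so the two calls give the words for 2.5 and -1.0.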
- ------
- Make this doc file a bit more presentable, and maybe leave it as plain
- text without Amigaguide OR HTML just for kicks.
- Write a more accurate test program, and maybe include it in the archive.
- Make a GOOD ffptest speedtest program. (in assembly)
- Maybe add command line-arguments to turn on/off selected patches.
- Make the patch exit, instead of hanging around.
- One word: APATCH (if this patch turns out to work as I'd like it to)
- ---------
- Martin Berndt...for sending me the sourcecode to Tieee and Fieee
- functions from his fmath40x mathtrans.library.
- Adam "DC1" Polosnik for ideas, help with sourcecode, and APATCH!
- Dave "Termy" Jones, for help, ideas, optimized functions, and StreamLineOS 2!
- Everyone who sent me emails, praise, and complaints....your input
- was greatly appreciated!!!!
- ----
- Hmmm...what would Tom say, ohyeah, um....possibly.
- Crummy speedtest program.
- ------
- Jess Sosnoski
- 651 Hillside Drive
- Mount Carmel, PA 17851-2463
- starblaz@postoffice.ptd.net
- http://home.ptd.net/~starblaz
- IRC nick: starblaz
- On: galaxynet (#amichat), beyondirc (#styx, #amirc), dalnet (#nin, #c-64)
- Emails, gifts, money, food, cigarettes, Amiga4060T's will all be
- gladly accepted.
